Emoji Could be Joined

Date: 2023-03-04

Apparently, this ‘👨‍👩‍👦’ (Family: Man, Woman, Boy) emoji consists of multiple emojis that are joined together.

$ node
Welcome to Node.js v19.7.0.
Type ".help" for more information.
> [..."👨‍👩‍👦"]
[ '👨', '‍', '👩', '‍', '👦' ]

or the other way around:

> `👨\u{200D}👩\u{200D}👦`
'👨‍👩‍👦'

Note: You might or might not see the family emoji as a single glyph when you run the above code in your terminal. To render this emoji correctly, you need a font that supports emoji (like the Google Noto font) and a terminal that can render Unicode characters spanning multiple cells (like kitty).

Huh. All this time I thought every Unicode character (including emojis) was represented by its own unique code (which is true, but I pictured it working more like ASCII characters). Today I learned that they can be joined or concatenated.

Take a look back at the example above. The emoji ‘👨‍👩‍👦’ consists of 5 other Unicode characters, or code points, 3 of which are the Man emoji (👨), the Woman emoji (👩), and the Boy emoji (👦). When you join the 3 emojis back into one, as in the second code snippet, you get the family emoji back.
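
To make the "5 code points, 1 emoji" observation concrete, here is a minimal sketch that counts the same string three ways. It assumes a Node.js build that ships `Intl.Segmenter` (Node 16 or newer); the helper name `countUnits` is made up for illustration.

```javascript
// The family emoji, written with escapes so the ZWJs are visible in source.
const FAMILY = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F466}";

// Hypothetical helper: count UTF-16 code units, code points, and
// grapheme clusters (what a reader perceives as one character).
function countUnits(text) {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  return {
    utf16: text.length,                               // String#length counts UTF-16 code units
    codePoints: [...text].length,                     // spread iterates by code point
    graphemes: [...segmenter.segment(text)].length,   // user-perceived characters
  };
}

console.log(countUnits(FAMILY));
// { utf16: 8, codePoints: 5, graphemes: 1 }
```

The spread operator used in the REPL session splits by code point, which is why it reveals the pieces that the renderer hides.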

So what is that empty char ('‍') then? What is this \u{200D} (or 0x200D) thing?

Both of those are the ZWJ, or Zero Width Joiner (U+200D). In emoji, this invisible character acts as glue that joins a sequence of emojis into another emoji. Emoji are not its only use (some scripts, such as Arabic and Indic ones, use it to request joined letterforms), but it is not needed to combine non-emoji Unicode characters like this one: é (Latin Small Letter E followed by Combining Acute Accent)

> [...'é']
[ 'e', '́' ]
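
The two kinds of "joining" above can be put side by side in a small sketch. `joinWithZwj` is a hypothetical helper, not a built-in; `String.prototype.normalize` is standard JavaScript.

```javascript
const ZWJ = "\u{200D}";

// Hypothetical helper: glue emoji together with a Zero Width Joiner.
function joinWithZwj(...parts) {
  return parts.join(ZWJ);
}

// Man + ZWJ + Woman + ZWJ + Boy gives the family sequence.
const family = joinWithZwj("\u{1F468}", "\u{1F469}", "\u{1F466}");
console.log([...family].length); // 5 (three emoji plus two ZWJs)

// A combining mark attaches to the preceding letter with no ZWJ involved.
const eAcute = "e" + "\u{0301}";
console.log([...eAcute].length);                     // 2
console.log(eAcute.normalize("NFC") === "\u{00E9}"); // true: composes to a single é
```

Whether the joined sequence shows up as one glyph still depends on the font; the helper only builds the string.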

Right then. Let’s play around with combining some emojis!

Suppose you have this ‘🧔’ or Person with Beard emoji and you want to create the male version of it (yes, this “Person with Beard” emoji is supposed to be gender neutral). I can guess what some of you might be thinking right now: “Let’s join this 🧔 emoji with the 👨 or Man emoji!”.

> `🧔\u{200D}👨`
'🧔‍👨'

And… it didn’t work? It just prints both emoji characters side by side, not the single 🧔‍♂️ or Man with a Beard emoji we expected to see.

Okay, let’s cheat this time.

> [...'🧔‍♂️']
[ '🧔', '‍', '♂', '️' ]

Uh oh! So you don’t join it with the “Man” emoji; instead you use the ‘♂️’ or Male Sign emoji. But why is there another ZWJ at the end?

> for (const c of "🧔‍♂️") { console.log(`0x${c.codePointAt(0).toString(16)}`) }
0x1f9d4
0x200d
0x2642
0xfe0f

It turns out that the last character is not a ZWJ! It is Variation Selector-16 (U+FE0F). This is another invisible character; it specifies that the preceding character should be displayed with emoji presentation instead of its default text presentation.
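
Since the ZWJ and Variation Selector-16 are both invisible, it can help to label them when dumping a sequence. The following is only a sketch; `describe` and the `INVISIBLE_NAMES` table are made-up names, and the table covers just the two characters discussed here.

```javascript
// Labels for the two invisible code points discussed in this post.
const INVISIBLE_NAMES = new Map([
  [0x200d, "ZWJ"],
  [0xfe0f, "VS16"],
]);

// Hypothetical helper: list a string's code points in U+XXXX notation,
// tagging the invisible ones so they are easy to tell apart.
function describe(text) {
  return [...text].map((ch) => {
    const cp = ch.codePointAt(0);
    const hex = "U+" + cp.toString(16).toUpperCase().padStart(4, "0");
    const name = INVISIBLE_NAMES.get(cp);
    return name ? `${hex} (${name})` : hex;
  });
}

console.log(describe("\u{1F9D4}\u{200D}\u{2642}\u{FE0F}"));
// [ 'U+1F9D4', 'U+200D (ZWJ)', 'U+2642', 'U+FE0F (VS16)' ]
```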

We could then use the above information to create… the female version of the “Person with Beard” emoji.

> `🧔\u{200D}♀\u{FE0F}`
'🧔‍♀️'

Want to add a skin tone? Follow the “Person with Beard” emoji with the skin tone of your choice, without using a ZWJ. Let’s pick the “U+1F3FB” code point, or Light Skin Tone, as an example.

> `🧔\u{1F3FB}\u{200D}♀\u{FE0F}`
'🧔🏻‍♀️'

Why use a code point this time?

Because the skin tone is not joined with a ZWJ, the browser reads the “Person with Beard” emoji followed by the “Light Skin Tone” emoji as a valid sequence and renders them as a single glyph, like this:

> `🧔🏻\u{200D}♀\u{FE0F}`
'🧔🏻‍♀️'

Using the code point shows clearly that we are appending the skin tone emoji (or code point) right after the person emoji.
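
The "put the skin tone right after the person emoji" rule can be sketched as a helper. `withSkinTone` and the `SKIN_TONES` table are hypothetical names; the sketch assumes the first code point of the input is an emoji that accepts a skin tone, and it does no validation beyond checking the tone name.

```javascript
// The five skin tone modifier code points (U+1F3FB..U+1F3FF).
const SKIN_TONES = {
  light: "\u{1F3FB}",
  mediumLight: "\u{1F3FC}",
  medium: "\u{1F3FD}",
  mediumDark: "\u{1F3FE}",
  dark: "\u{1F3FF}",
};

// Hypothetical helper: insert the modifier directly after the first
// code point, with no ZWJ in between. Assumes that first code point
// is a person emoji that supports skin tones.
function withSkinTone(emoji, tone) {
  const modifier = SKIN_TONES[tone];
  if (modifier === undefined) throw new RangeError(`unknown tone: ${tone}`);
  const [base, ...rest] = [...emoji];
  return base + modifier + rest.join("");
}

const bearded = "\u{1F9D4}\u{200D}\u{2640}\u{FE0F}";
console.log(withSkinTone(bearded, "light") === "\u{1F9D4}\u{1F3FB}\u{200D}\u{2640}\u{FE0F}"); // true
```

Splitting with the spread operator matters here: indexing the string directly would cut the person emoji's surrogate pair in half.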

It would be fun to play a ‘Guess the Codepoint’ game, I think?

Let’s start with the easiest one.

Guess the Codepoint!

  1. ❤️‍🔥 (Heart on Fire)?

    Yep, it is the Red Heart and Fire emojis.

    > [...'❤️‍🔥']
    [ '❤', '️', '‍', '🔥' ]
    
  2. 😮‍💨 (Face Exhaling)?

    It is Face with Open Mouth and Dashing Away.

    > [...'😮‍💨']
    [ '😮', '‍', '💨' ]
    

    You need to have your mouth open to be able to exhale, right?

  3. 😵‍💫 (Face with Spiral Eyes)?

    It is Face with Crossed-Out Eyes and Dizzy.

    > [...'😵‍💫']
    [ '😵', '‍', '💫' ]
    

    I don’t know the symbolic difference between ‘x’ eyes and spiral eyes, but both are similarly used to represent dizziness (in this emoji context).

  4. 👨‍🍼 (Man Feeding Baby)?

    It is Man and Baby Bottle. (Huh, where is the baby??)

    > [...'👨‍🍼']
    [ '👨', '‍', '🍼' ]
    

    Also, hope we can get Man breast-feeding soon!

  5. 🐻‍❄️ (Polar Bear)?

    Yep. It is a Bear sprinkled with a Snowflake.

    > [...'🐻‍❄️']
    [ '🐻', '‍', '❄', '️' ]
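
To check guesses like the ones above without squinting at invisible characters, one option is to split a sequence at each ZWJ and print the code points of every component. This is just a sketch, and `splitZwjSequence` is a made-up name.

```javascript
// Hypothetical helper: break a ZWJ sequence into its component emoji
// and list each component's code points in U+XXXX notation.
function splitZwjSequence(emoji) {
  return emoji.split("\u{200D}").map((part) =>
    [...part].map(
      (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
    )
  );
}

// Polar Bear: Bear + ZWJ + Snowflake + VS16
console.log(splitZwjSequence("\u{1F43B}\u{200D}\u{2744}\u{FE0F}"));
// [ [ 'U+1F43B' ], [ 'U+2744', 'U+FE0F' ] ]

// Heart on Fire: Red Heart + VS16 + ZWJ + Fire
console.log(splitZwjSequence("\u{2764}\u{FE0F}\u{200D}\u{1F525}"));
// [ [ 'U+2764', 'U+FE0F' ], [ 'U+1F525' ] ]
```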
    



Ref: https://fasterthanli.me/articles/the-bottom-emoji-breaks-rust-analyzer